1 Introduction


Vector quantization (VQ) has received much attention and is a powerful and effective 
technique for image compression [9]. A motivation for this approach is that the 
performance of vector quantizers can approach the distortion-rate bound D(R) as the 
vector size becomes sufficiently large [3]. However, the rate at which the performance 
of VQ approaches the bound D(R) as a function of increasing vector size is rather slow 
[3]. Moreover, both the computation and memory requirements associated with VQ 
increase exponentially as the vector size increases. Therefore, relatively small vectors, 
typically of size 4x4, are usually used in the design of unconstrained exhaustive search 
VQ codebooks for image coding. 

Reducing the large complexity and memory requirements of VQ has been the fo- 
cus of much research. Various imposed structural constraints have been considered, 
but such constraints generally lead to reduced performance for a given rate and di- 
mension. However, the reduction in complexity obtained is often a good trade for the 
moderate loss in quality. Some examples of structured vector quantizers are lattice 
VQ [7], hierarchical VQ [27], and tree-searched VQ (TSVQ) [4, 10]. Residual vector 
quantization (RVQ) or multistage VQ is one such structured vector quantizer whose 
structure reduces both the memory and computation costs, and is able to operate over 
a large range of bit rates and vector sizes. The recent interest in RVQ is due largely 
to its good complexity/performance tradeoffs, and to the recent advances made in
design methodology, which have resulted in noticeable improvements over previous 
design methods [14, 26]. 

The structural constraints of RVQ result in a performance degradation compared 
to an unconstrained VQ with the same bit rate and vector size. This degradation 
can be attributed to two factors. First, the RVQ decoder is constrained by a direct-sum
codebook structure in which all possible output vectors of the RVQ are formed by
the sum of stage code vectors; this set is called the direct-sum codebook. Second,
the encoder typically employs an efficient sequential stage-wise search procedure for 
practical reasons. However, entanglements in the RVQ tree tend to reduce encoding 
accuracy when fast searching is performed. This difficulty is obviated by exhaustive 
searching or other forms of optimal sequential searching (see [1]) but the price paid 
in computational complexity is generally enormous. 

Looking beyond this, however, the structure of RVQ has properties that make it 
attractive. The multi-stage structure can be exploited to produce variable number- 
of-stages RVQ (one form of variable rate RVQ), which was shown in [19, 20] to lead 
to improvements in performance over fixed rate RVQ. In addition, the direct-sum 
structural constraint usually leads to an RVQ output entropy which is much smaller 
than the logarithm of the number of direct-sum code vectors. Experimental evidence 
suggests that the decrease in output entropy compensates for the increase in average 
distortion which, in turn, leads to a very competitive coding system. 

A simple approach to constructing another form of variable rate RVQ is to combine 
a fixed rate RVQ with a noiseless coder. However, a better approach is to directly 
incorporate entropy coding in the design process. The joint optimization of a VQ 
and an entropy coder was shown to lead to a significant improvement in performance 
for the conventional VQ case [5, 6]. This motivates the investigation of an RVQ 
design algorithm that minimizes the average distortion subject to a constraint on the 
output entropy of the RVQ. This paper introduces a new entropy-constrained RVQ 
(EC-RVQ) design algorithm that is very effective in designing variable rate RVQ 
codebooks. EC-RVQ is shown to be capable of outperforming conventional EC-VQ 
in terms of computational complexity, memory requirements and coding quality, and 
has the ability to operate over a much wider range of bit rates and vector sizes. 

To set the mathematical notation and terminology used throughout this paper,
the next section begins with a brief summary of fixed rate RVQ. To lay the foundation
for the discussion of variable rate RVQ and the development of the new EC-RVQ
design algorithm, necessary conditions for the optimality of fixed rate RVQ and the
corresponding design algorithms are also discussed in that section. Next, methods
of constructing three forms of variable rate RVQs are discussed and compared in Sec- 
tion 3. Necessary conditions for the optimality of variable rate RVQ are presented, 
and a discussion of the new EC-RVQ algorithm is considered in Section 4. Section 5 
discusses the performance of EC-RVQ when used in image coding applications. The 
paper concludes with some general comments on improving EC-RVQ performance 
that reflect work presently under study. 

2 Fixed Rate RVQ 

Residual vector quantization (RVQ) or multistage VQ consists of a cascade of VQ
stages, each operating on the “residual” of the previous stage. A block diagram of a
P-stage RVQ is given in Figure 1 for illustration. A general RVQ consisting of P stages
(with $N_i$ vectors in the ith stage) is capable of uniquely representing
$N = \prod_{i=1}^{P} N_i$ vectors with only $\sum_{i=1}^{P} N_i$ code vectors required
for storage. Thus, the RVQ achieves tremendous savings over unconstrained VQ in terms
of memory requirements, and may also achieve similar savings in computations.
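To make the size of this saving concrete, the following minimal Python sketch evaluates both counts. The function name `rvq_sizes` and the 8-stage, 16-vectors-per-stage configuration are assumptions chosen only for illustration.

```python
from math import prod

def rvq_sizes(stage_sizes):
    """Return (N, stored): N = product of the stage sizes is the number of
    direct-sum code vectors; stored = sum of the stage sizes is the number
    of code vectors that must actually be kept in memory."""
    return prod(stage_sizes), sum(stage_sizes)

# Hypothetical configuration: 8 stages with 16 code vectors per stage.
n_direct, n_stored = rvq_sizes([16] * 8)
print(n_direct, n_stored)
```

With these assumed sizes the RVQ represents 16^8 distinct output vectors while storing only 128 stage code vectors.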

To establish the notation and review the key points for optimal fixed rate RVQ,
let $x_1$ be a realization of the random $k$-dimensional vector $X_1$ described by the
probability density function (pdf) $f_{X_1}(x_1)$ on $\Re^k$, and assume this to be the
input to the P-stage RVQ shown in Figure 1. For the pth stage VQ with $1 \le p \le P$,
let us define the following symbols:

Figure 1: A P-stage residual vector quantizer. Stage p has index set
$J_p = \{0, 1, \ldots, N_p - 1\}$ and $N_p$ code vectors $\{y_p(j_p);\ j_p = 0, 1, \ldots, N_p - 1\}$.

$N_p$          the pth stage codebook size (number of codebook vectors)

$j_p$          the pth stage index, $0 \le j_p \le N_p - 1$

$J_p$          the set of all possible values of $j_p$, i.e., $\{0, 1, 2, \ldots, N_p - 1\}$

$y_p(j_p)$     the $j_p$th code vector

$S_p(j_p)$     the $j_p$th partition cell

$V_p(j_p)$     the $j_p$th conditional-stage residual cell

$C_p$          the pth stage codebook $\{y_p(j_p) : j_p \in J_p\}$

$\mathcal{P}_p$   the pth stage partition $\{S_p(j_p) : j_p \in J_p\}$

$Q_p$          the pth stage quantizer mapping

Associated with a P-stage RVQ is an equivalent single-stage direct-sum VQ. The 
direct-sum VQ and RVQ are identical in the sense that they produce the same repre- 
sentation of the source output and they have the same expected distortion. For the 
direct-sum VQ, let us define the following symbols: 

$N$            direct-sum codebook size ($N = \prod_{p=1}^{P} N_p$)

$J$            direct-sum P-tuple index set $J_1 \times J_2 \times \cdots \times J_P$

$j$            a P-tuple index in $J$

$y(j)$         the jth direct-sum code vector

$V(j)$         the jth direct-sum partition cell

$C$            direct-sum codebook $\{y(j) : j \in J\}$

$\mathcal{P}$     direct-sum partition $\{V(j) : j \in J\}$

$Q$            direct-sum mapping $Q(x_1) = \sum_{p=1}^{P} Q_p(x_p)$

The direct-sum codebook contains all possible ordered sums of the stage code vectors,
i.e., $C = C_1 \oplus C_2 \oplus \cdots \oplus C_P$. The direct-sum code vectors are given by
$y(j) = \sum_{p=1}^{P} y_p(j_p)$, where $j_p$ is the pth member of the ordered P-tuple index $j$.
The direct-sum VQ quantizes the source vector $x_1$ and outputs the representation
$\hat{x}_1 = Q(x_1)$ given by $Q(x_1) = \sum_{p=1}^{P} Q_p(x_p)$, where we call
$x_p = x_1 - \sum_{i=1}^{p-1} Q_i(x_i)$ the pth stage causal residual. The term causal refers
to the stages supporting the computation of the residual; i.e., the stage residuals are
computed sequentially starting from the first stage to the pth stage.
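The causal-residual recursion and the direct-sum reconstruction can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes the squared error distortion and a greedy stage-by-stage search, and the names `rvq_encode`, `rvq_decode` and the toy codebooks are hypothetical.

```python
def sq_dist(a, b):
    """Squared error between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def rvq_encode(x, stage_codebooks):
    """Sequential search: stage p quantizes the causal residual
    x_p = x_1 - sum_{i<p} Q_i(x_i) left by the earlier stages."""
    residual = list(x)
    indices = []
    for codebook in stage_codebooks:
        j = min(range(len(codebook)), key=lambda i: sq_dist(residual, codebook[i]))
        indices.append(j)
        residual = [r - c for r, c in zip(residual, codebook[j])]
    return indices

def rvq_decode(indices, stage_codebooks):
    """Direct-sum reconstruction y(j) = sum_p y_p(j_p)."""
    out = [0.0] * len(stage_codebooks[0][0])
    for j, codebook in zip(indices, stage_codebooks):
        out = [o + c for o, c in zip(out, codebook[j])]
    return out

# Toy two-stage codebooks for 2-dimensional vectors (illustrative values only).
codebooks = [[[0.0, 0.0], [4.0, 4.0]],
             [[-1.0, -1.0], [1.0, 1.0]]]
idx = rvq_encode([3.2, 2.9], codebooks)
print(idx, rvq_decode(idx, codebooks))
```

The decoder needs only the P-tuple of stage indices; the direct-sum code vector is never stored explicitly.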

To formalize this notion, let the distortion that results from representing the input
$x_1$ by the quantized output $\hat{x}_1$ be expressed by $d(x_1, \hat{x}_1)$. The distortion measure
$d(x, y)$ is assumed to be a non-negative real-valued function that satisfies the following
requirements:

1. For any fixed $x \in \Re^k$, $d(x, y)$ is a continuously differentiable function of $y \in \Re^k$.

2. $d(x, y)$ is translationally invariant.

3. For any fixed $x \in \Re^k$, $d(x, y)$ is a strictly convex function of $y$, that is,
$\forall\, y_1 \ne y_2 \in \Re^k$ and $\lambda \in (0, 1)$,
$d(x, \lambda y_1 + (1 - \lambda) y_2) < \lambda\, d(x, y_1) + (1 - \lambda)\, d(x, y_2)$.

A P-stage RVQ is said to be optimal if it gives at least a locally minimum value of the 
average distortion. There are two necessary conditions for the optimality of fixed rate 
RVQ [1, 2]. First, the encoder must map the input vectors according to the following 
nearest-neighbor rule: 

$x_1 \in V(j)$ if and only if $d(x_1, y(j)) \le d(x_1, y(k))$ for all $k \in J$.   (1)

Second, the stage code vectors $y_p(j_p)$ at the pth stage must satisfy [2, 23]

$\int d(\gamma_p, y_p(j_p))\, f_{\Gamma_p | j_p}(\gamma_p)\, d\gamma_p = \inf_{u \in \Re^k} \int d(\gamma_p, u)\, f_{\Gamma_p | j_p}(\gamma_p)\, d\gamma_p < \infty$   (2)

where $\gamma_p = x_1 - \sum_{i \ne p} y_i(j_i)$ is a realization of the conditional-stage residual
random vector $\Gamma_p$, and the pdf $f_{\Gamma_p | j_p}(\gamma_p)$ is related to the source pdf
$f_{X_1}(\cdot)$ according to

$f_{\Gamma_p | j_p}(\gamma_p) = \frac{1}{\mathrm{pr}(\gamma_p \in V_p(j_p))} \sum_{j} I[V(j)]\, f_{X_1}\!\left(\gamma_p + \sum_{i \ne p} y_i(j_i)\right)$   (3)

where $\beta_p(j) = (j_1, j_2, \ldots, j_{p-1}, j_{p+1}, \ldots, j_P)$ collects the indices of all stages
other than the pth, the sum is taken over the set of all indices
$j = (k_1, k_2, \ldots, k_{p-1}, j_p, k_{p+1}, \ldots, k_P) \in J$ whose pth component is $j_p$,
and $I[V(j)]$ is an indicator function for the direct-sum partition cell $V(j)$, that
is, $I[V(j)] = 1$ if $x_1 \in V(j)$ and $I[V(j)] = 0$ otherwise. The $y_p(j_p)$'s which satisfy
equation (2) are generalized centroids of conditional-stage residual vectors (i.e.,
residual vectors formed from the encodings of all prior and subsequent RVQ stages).
Hence, the second condition will be referred to as the conditional-stage residual centroid
condition hereafter. A mathematical derivation of these two conditions is given
in [2, 23].


2.1 The Fixed Rate RVQ Design Algorithm 

The fixed rate RVQ design algorithm, introduced in [1], attempts to optimize all stage 
codebooks jointly to minimize the reconstruction error over all training data subject 
to a constraint on the number of direct-sum code vectors. Assuming that all stage 
codebooks are held fixed, optimization of the encoder implies that each training set 
vector is mapped to its closest direct-sum code vector using the nearest-neighbor rule 
(1). In general, this can be accomplished by exhaustively searching the direct sum 
codebook. However, this technique typically carries sufficient computational overhead 
to be unattractive. An alternative approach is to sequentially search the RVQ stage 
codebooks. This technique results in an increase in speed, but unfortunately leads 
to a significant degradation in performance since optimal code vector selection in the 
direct-sum codebook is no longer guaranteed. To address this issue, the M-search
technique was explored and was shown to be very efficient when used to search the
RVQ tree [1, 18]. Small improvements can be obtained by simply using M-search
when encoding the input using a sequentially-designed RVQ codebook. However,
better results can be obtained by directly incorporating the M-search in the RVQ
design as well as in the encoder [2, 18]. An additional gain can be achieved for the
same complexity by allowing the value of M to be larger in some stages of the RVQ
and smaller in others. This can be done by first defining a desired level for the average
number of M-search computations. Using a large training set, the best value of M
for each stage can be determined empirically such that the total number of M-search
computations is within the pre-specified tolerance.
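A multipath search of this kind can be sketched as follows. This is an illustrative Python sketch rather than the implementation of [1, 18]: it assumes the squared error distortion, uses the same M at every stage, and the function name `m_search_encode` and the scalar codebooks are hypothetical.

```python
def m_search_encode(x, stage_codebooks, M):
    """Keep the M lowest-distortion partial paths after every stage.
    M = 1 reduces to the sequential (greedy) search."""
    paths = [([], list(x))]                      # (indices so far, residual)
    for codebook in stage_codebooks:
        candidates = []
        for indices, residual in paths:
            for j, c in enumerate(codebook):
                new_residual = [r - ci for r, ci in zip(residual, c)]
                candidates.append((indices + [j], new_residual))
        # Rank the extended paths by the energy of their residual.
        candidates.sort(key=lambda cand: sum(r * r for r in cand[1]))
        paths = candidates[:M]
    best_indices, best_residual = paths[0]
    return best_indices, sum(r * r for r in best_residual)

# Scalar toy example in which the greedy path is a poor choice.
codebooks = [[[0.0], [10.0]], [[-1.0], [6.0]]]
print(m_search_encode([5.9], codebooks, M=1))
print(m_search_encode([5.9], codebooks, M=2))
```

In this toy example the greedy search commits to the first-stage vector 10 and cannot recover, whereas M = 2 retains the path through 0 and finds the direct-sum vector 0 + 6, which is much closer to the input.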

Given a fixed direct-sum partition, the fixed rate RVQ design method used in 
this work is simply an iterative Gauss-Seidel algorithm that jointly optimizes the 
stage codebooks by successively operating on each RVQ stage while holding fixed all 
other stage codebooks. At each stage optimization step, code vectors are found that 
simultaneously satisfy the conditional-stage residual centroid condition (2). Assuming 
that the squared error distortion measure is used, each “decoder-only” iteration will 
update the stage codebooks such that the average distortion will either be reduced or 
left unchanged [24]. Using theorems in [11], it can be shown [24] that if the encoder 
yields a Voronoi partition with respect to the direct-sum codebook, then the fixed 
rate RVQ design algorithm converges monotonically to a fixed point which satisfies 
the necessary conditions (1) and (2) for minimum squared error distortion. 
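One decoder-only Gauss-Seidel pass can be sketched as below for the squared error distortion, where the generalized centroid of condition (2) is the arithmetic mean of the conditional-stage residuals. The function names and the toy data are assumptions for illustration, and the direct-sum partition (the `assignments` list) is held fixed throughout.

```python
def reconstruct(idx, codebooks):
    """Direct-sum code vector y(j) = sum over stages of y_p(j_p)."""
    out = [0.0] * len(codebooks[0][0])
    for cb, j in zip(codebooks, idx):
        out = [o + c for o, c in zip(out, cb[j])]
    return out

def avg_distortion(train, assignments, codebooks):
    total = 0.0
    for x, idx in zip(train, assignments):
        y = reconstruct(idx, codebooks)
        total += sum((a - b) ** 2 for a, b in zip(x, y))
    return total / len(train)

def update_stage(train, assignments, codebooks, p):
    """Replace each stage-p code vector by the mean of the conditional-stage
    residuals (x minus all other stage contributions) assigned to it."""
    k = len(train[0])
    sums = [[0.0] * k for _ in codebooks[p]]
    counts = [0] * len(codebooks[p])
    for x, idx in zip(train, assignments):
        y = reconstruct(idx, codebooks)
        # x - y + y_p(j_p) removes every stage contribution except stage p.
        res = [a - b + c for a, b, c in zip(x, y, codebooks[p][idx[p]])]
        sums[idx[p]] = [s + r for s, r in zip(sums[idx[p]], res)]
        counts[idx[p]] += 1
    for j, n in enumerate(counts):
        if n:                                    # empty cells keep their old vector
            codebooks[p][j] = [s / n for s in sums[j]]

def gauss_seidel_pass(train, assignments, codebooks):
    """Update the stages one at a time, holding the others fixed."""
    for p in range(len(codebooks)):
        update_stage(train, assignments, codebooks, p)

train = [[0.0], [1.0], [4.0], [5.0]]
assignments = [(0, 0), (0, 1), (1, 0), (1, 1)]   # fixed direct-sum partition
codebooks = [[[0.0], [3.0]], [[0.0], [2.0]]]
before = avg_distortion(train, assignments, codebooks)
gauss_seidel_pass(train, assignments, codebooks)
print(before, avg_distortion(train, assignments, codebooks))
```

Because each stage update is a centroid computation with all other stages fixed, the average squared error cannot increase from one update to the next.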

This proven convergence behavior is based on an exhaustive search encoder, which
is not realistic for a practical system in general. For practical applications, a sequential
nearest neighbor or an M-search encoder is used. In these cases, the encoder
optimization step may actually increase the average distortion and monotonic convergence
cannot be guaranteed. However, experimental results have shown that the
sequential-search RVQ design algorithm effectively reduces the average distortion with
only occasional deviations from monotonicity. Furthermore, in all our experiments,
the M-search RVQ design algorithm converged monotonically to a local minimum,
even when relatively small values of M (such as 2 or 3) were used.

2.2 Comments on RVQ Performance 

An upper bound on the performance of fixed rate RVQ is the performance of exhaustive- 
search VQ [26]. For the same bit rate and vector size (i.e., same number of code 
vectors), the average distortion introduced by the RVQ can be shown to be generally 
larger than that introduced by an unconstrained VQ. For example, assume that
a conventional VQ and an RVQ have the same fixed partition of $\Re^k$. It is shown in
[11] that the average distortion can be minimized if and only if the code vectors are 
selected as the centroids of their respective partition cells. Since the code vectors 
in the conventional VQ codebook are structurally independent, this selection can be 
done separately for each partition cell. However, code vectors formed by direct-sums 
of stage code vectors are structurally dependent and hence it is unlikely all will be 
centroids of their respective direct-sum partition cells. As a matter of fact, these 
direct-sum code vectors are not guaranteed to even lie within their respective cells. 
Therefore, the average distortion of RVQ is higher than that of conventional VQ. 

However, by using large vector sizes and multi-path searching, the RVQ perfor- 
mance is shown to exceed that of conventional VQ with only a fraction of the compu- 
tation and memory requirements [18]. Moreover, the direct-sum codebook constraint 
usually leads to an output entropy H that is smaller than that of unconstrained VQ. 
This can be easily demonstrated using the fact that the joint entropy of a collection 
of sources (or random variables) is less than or equal to the sum of the entropies of 
the individual sources [8]. That is, given P random variables $X_1, \ldots, X_P$,

$H(X_1, \ldots, X_P) \le \sum_{p=1}^{P} H(X_p).$

Given the set of P-tuple indices $J$, one can uniquely index all the code vectors in
an unconstrained VQ codebook (which has the same number of code vectors as the
direct-sum RVQ codebook) by the mapping $\gamma : J_1 \times \cdots \times J_P \mapsto J$, where

$\gamma(j_1, \ldots, j_P) = \sum_{p=1}^{P} j_p \prod_{k=0}^{p-1} |J_k|,$

where $|J_0| = 1$ and $|J_k|$ is the size of the set $J_k$. Since the $j_1, \ldots, j_P$ are independent
(they are chosen arbitrarily), the output entropy of the unconstrained VQ is

$H(J) = \sum_{p=1}^{P} H(J_p),$

where $H(J_p)$ is the entropy of $J_p$. One can also use the same indexing scheme to
index the direct-sum codebook, except that now $j_p$ denotes the index for the pth
stage of the RVQ. As noted earlier, the RVQ stages are related by the direct-sum
structure, and are not independent. Thus,
$H_{RVQ}(J) = H(J_1, \ldots, J_P) \le \sum_{p=1}^{P} H(J_p) = H_{VQ}(J)$.
Notice that this result can also be obtained by using the fact
that the entropy of the collection of the random variables $J_1, \ldots, J_P$ is equal to
the sum of the conditional entropies, i.e.,

$H(J_1, J_2, \ldots, J_P) = \sum_{p=1}^{P} H(J_p | J_{p-1}, \ldots, J_1).$

The previous results suggest that the direct-sum codebook constraints can gen- 
erally be expected to lead to both an increased average distortion and a decreased 
output entropy. This implies that for a given average bit rate, variable rate RVQ 
could conceivably have the potential to be competitive with variable rate VQ. 
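The entropy inequality and the index mapping used in this argument can be checked numerically. The Python sketch below uses a small, made-up list of two-stage index pairs in which the second index is fully determined by the first; the function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(samples):
    """Empirical entropy (bits) of a list of hashable symbols."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def mixed_radix_index(indices, sizes):
    """gamma(j_1, ..., j_P) = sum_p j_p * prod_{k<p} |J_k|, with |J_0| = 1."""
    out, weight = 0, 1
    for j, n in zip(indices, sizes):
        out += j * weight
        weight *= n
    return out

# Strongly dependent stage indices (illustrative data only).
pairs = [(0, 0), (0, 0), (1, 1), (1, 1)]
h_joint = entropy(pairs)
h_sum = entropy([a for a, _ in pairs]) + entropy([b for _, b in pairs])
print(h_joint, h_sum)                       # joint entropy <= sum of stage entropies
print(mixed_radix_index((1, 1), (2, 2)))    # single integer label for the pair
```

Here the joint entropy is 1 bit while the stage entropies sum to 2 bits, so the conditional entropy of the second stage given the first is zero.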

3 Variable Rate RVQ 

For a given vector size $k$, variable rate VQ implementations are those that, if properly
designed, can operate at bit rates close to the ones given by the kth order rate-distortion
curve $R_k(D)$ of the input. There are several ways in which a variable rate
RVQ can be constructed. As reported in [19], a variable rate implementation can be
achieved by exploiting the inherent multi-stage structure of RVQ. Since each stage
contributes independently to the total bit rate, variable rate coding can be achieved 
easily by truncating the number of RVQ stages used for a given source vector. For 
each input vector, the encoding terminates once the distortion falls below a prescribed 
threshold. Clearly, the encoder and the decoder must both have knowledge of the 
number of stages (bit rate) used to encode a given vector. Sending such a rate to 
the decoder is usually done by sending side information, which can be very costly. 
However, when relatively large vector sizes (such as 8 x 8 or 16 x 16) are used, side 
information requires only a small fraction of the total bit rate [19]. This variable rate
technique has two advantages: 1) Incorporating such a technique into the RVQ design 
algorithm leads to reduced encoding complexity because fewer distortion calculations 
are needed to encode vectors with low variances, and the centroid computation re- 
quires fewer additions; and 2) variable rate RVQ of this type tends to allow for a 
better match to the statistics of images. A large number of bits can be used to en- 
code edge vectors while a small number can be used to encode low variance vectors 
[19]. 
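The stage-truncation rule can be sketched as follows. This is a simplified Python illustration, not the scheme of [19]: it assumes the squared error distortion and a greedy search, always uses at least one stage, and the names `encode_variable_stages` and `threshold` are hypothetical.

```python
def encode_variable_stages(x, stage_codebooks, threshold):
    """Encode stage by stage and stop as soon as the squared error of the
    running residual drops below `threshold`. The number of indices returned
    is the side information the decoder would need."""
    residual = list(x)
    indices = []
    for codebook in stage_codebooks:
        j = min(range(len(codebook)),
                key=lambda i: sum((r - c) ** 2 for r, c in zip(residual, codebook[i])))
        indices.append(j)
        residual = [r - c for r, c in zip(residual, codebook[j])]
        if sum(r * r for r in residual) < threshold:
            break
    return indices

# Scalar toy codebooks (illustrative values only).
codebooks = [[[0.0], [8.0]], [[-2.0], [2.0]], [[-0.5], [0.5]]]
print(encode_variable_stages([8.1], codebooks, threshold=0.05))   # one stage suffices
print(encode_variable_stages([5.6], codebooks, threshold=0.05))   # all three stages used
```

Inputs that are already well represented after the first stage consume fewer bits and fewer distortion calculations than inputs that need the full cascade.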

Another approach to variable rate RVQ is to entropy code the RVQ output indices. 
In this case, a fixed rate RVQ is combined with a variable rate lossless coder (such 
as a Huffman coder). This can be done by considering the RVQ direct-sum code 
vectors to be symbols in an extended source alphabet and constructing a variable 
length lossless code for them. The complicated interdependencies among the stages
of an RVQ often result in a direct-sum codebook where the code vectors have a
very nonuniform probability distribution. Therefore, the output entropy of RVQ is
usually much smaller than the logarithm of the number of direct-sum code vectors.
Experimental results, reported in [20], show that the output entropy of the direct-sum
codebook is much smaller than that of the unconstrained VQ codebook (for the same
number of code vectors). Thus the RVQ/entropy coder combination can lead to a
substantially lower average bit rate while maintaining the same performance level as
a fixed rate RVQ.

A superior approach to the variable rate RVQ implementation described above 
is one in which all code vectors and codewords are optimized with respect to each 
other. Therefore, the natural design problem for entropy-based RVQ is to find a 
direct-sum codebook whose vectors minimize the average reconstruction error over 
all training set data subject to a constraint on the output entropy of the RVQ. In 
the next section, necessary conditions for the optimality of variable rate RVQ are 
presented, an entropy constrained RVQ (EC-RVQ) design algorithm which satisfies 
these conditions is introduced, and the performance of this algorithm is demonstrated 
and discussed. 

4 Entropy-Constrained RVQ 

The high level structure of the EC-RVQ is illustrated in Figure 2. It consists of a
P-stage RVQ where the stage codewords are input to a mapping operator. The mapping
operator transforms the direct-sum index $j = (j_1, j_2, \ldots, j_P)$ into a variable
length codeword $c(j)$ that is then used as the representation of the compressed data.
The mapping operator can be an entropy coder or a collection of stage entropy coders.
The idea underlying the entropy mapping operation is that $j$'s that occur very often
are represented with short codewords and $j$'s that occur infrequently are represented
with longer codewords such that the average bit rate is reduced.

4.1 Necessary Conditions for Optimal Variable Rate RVQ 

For the direct-sum VQ, let $\mathcal{J}$ be the set of variable length indices
$\{c(j), j \in J\}$. The direct-sum VQ, $Q : \Re^k \mapsto C$, quantizes the source vector
$x_1$ and outputs $Q(x_1)$, and may be realized by a composition of a variable length
encoder mapping $\mathcal{E} : \Re^k \mapsto \mathcal{J}$ where

$\mathcal{E}(x_1) = c(j)$ if and only if $x_1 \in V(j)$,

and a variable length decoder mapping $\mathcal{D} : \mathcal{J} \mapsto C$ where

$\mathcal{D}(c(j)) = y(j).$

The variable length encoder can be further decomposed into two mappings,
$\mathcal{E} = L \circ E$, where $E : \Re^k \mapsto J$ and $L : J \mapsto \mathcal{J}$, and $\circ$ denotes
composition. Similarly, one can decompose the variable length decoder into two mappings,
$\mathcal{D} = D \circ L^{-1}$, where $L^{-1} : \mathcal{J} \mapsto J$ and $D : J \mapsto C$. Note that
the mapping $L$ is an invertible mapping with inverse $L^{-1}$.

Figure 2: The EC-RVQ Structure. The input passes through the P stage quantizers (stage p
has index set $J_p = \{0, 1, \ldots, N_p - 1\}$ and $N_p$ code vectors $y_p(j_p)$), and the
mapping operator sends the resulting codeword to the channel.

Let $x_1$ be a realization of the random $k$-dimensional vector $X_1$ described by the
probability density function (pdf) $f_{X_1}(x_1)$ on $\Re^k$. Also, let the distortion that results
from representing $x_1$ with $\hat{x}_1$ be expressed by $d(x_1, \hat{x}_1)$. The distortion measure
$d(x, y)$ is assumed to be a non-negative real-valued function that satisfies
requirements 1-3 listed in Section 2. A variable rate P-stage RVQ (with an average rate
$\le R$) is said to be optimal for $f_{X_1}(\cdot)$ if it gives a locally or globally minimum value
of the average distortion. The design problem can be stated as follows: Choose the
codebook $C$, partition $\mathcal{P}$ and mapping $L$ that minimize the Lagrangian

$J_\lambda(E, L, D) = \mathbb{E}\{ d(x_1, \hat{x}_1) + \lambda\, |L(j)| \}$   (4)

where $\lambda$ is the Lagrange multiplier, $\mathbb{E}\{\cdot\}$ denotes expectation, and $|L(j)|$
denotes the length of $L(j)$.

There are three necessary conditions for the optimality of variable rate RVQ [23].
First, the encoder must map the input vectors according to the following nearest-neighbor
encoding rule:

$x_1 \in V(j)$ iff $d(x_1, y(j)) + \lambda\, |L(j)| \le d(x_1, y(k)) + \lambda\, |L(k)|$ for all $k \in J$.   (5)

Second, the mapping $L$ must be one that minimizes the expected codeword length,
$R = \sum_{j \in J} |L(j)|\, \mathrm{pr}(j)$, where $\mathrm{pr}(j) = \mathrm{pr}(x_1 \in V(j))$. Setting the codeword length
$|L(j)|$ to

$|L(j)| = -\log_2 \mathrm{pr}(j) = -\log_2 \mathrm{pr}(j_1, j_2, \ldots, j_P)$   (6)

results in an average rate which is equal to the output entropy of the direct-sum RVQ.
Third, the stage code vectors $y_p(j_p)$ at the pth stage must satisfy the conditional-stage
residual centroid condition (2). A complete derivation of these conditions is
involved and may be found in [23].

The probability $\mathrm{pr}(j_1, j_2, \ldots, j_P)$ of a path in the RVQ can also be written as the
product of conditional probabilities, i.e.,

$\mathrm{pr}(j_1, j_2, \ldots, j_P) = \mathrm{pr}(j_P | j_{P-1}, \ldots, j_1)\, \mathrm{pr}(j_{P-1} | j_{P-2}, \ldots, j_1) \cdots \mathrm{pr}(j_2 | j_1)\, \mathrm{pr}(j_1).$

Therefore,

$|L^*(j)| = -\log_2 \mathrm{pr}(j_P | j_{P-1}, \ldots, j_1) - \log_2 \mathrm{pr}(j_{P-1} | j_{P-2}, \ldots, j_1) - \cdots - \log_2 \mathrm{pr}(j_2 | j_1) - \log_2 \mathrm{pr}(j_1)$

and

$H(J_1, \ldots, J_P) = \sum_{p=1}^{P} H(J_p | J_{p-1}, \ldots, J_1).$
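The encoding rule (5) with the self-information lengths (6) can be illustrated with a short Python sketch. It assumes the squared error distortion and an exhaustive search over a tiny direct-sum codebook; the names `codeword_lengths`, `ec_encode` and the probabilities are made up for illustration.

```python
from math import log2

def codeword_lengths(probabilities):
    """Ideal (possibly non-integer) lengths |L(j)| = -log2 pr(j)."""
    return {j: -log2(p) for j, p in probabilities.items() if p > 0}

def ec_encode(x, direct_sum_codebook, lengths, lam):
    """Pick the index j minimizing d(x, y(j)) + lam * |L(j)|."""
    def cost(j):
        y = direct_sum_codebook[j]
        return sum((a - b) ** 2 for a, b in zip(x, y)) + lam * lengths[j]
    return min(lengths, key=cost)

# Two direct-sum code vectors with unequal (assumed) probabilities.
codebook = {(0, 0): [0.0], (0, 1): [1.0]}
lengths = codeword_lengths({(0, 0): 0.75, (0, 1): 0.25})
print(ec_encode([0.6], codebook, lengths, lam=0.0))   # pure nearest neighbor
print(ec_encode([0.6], codebook, lengths, lam=0.5))   # rate term favors the likelier index
```

With $\lambda = 0$ the rule reduces to the fixed rate nearest-neighbor rule (1); as $\lambda$ grows, the encoder increasingly prefers indices with short codewords even when they are not the closest.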

4.2 The EC-RVQ Design Algorithm 

The EC-RVQ design algorithm is an iterative descent algorithm similar to the one
used for the design of EC-VQ codebooks. Each iteration consists of applying the
transformation

$(E(t+1), L(t+1), D(t+1)) = T(E(t), L(t), D(t))$

where

$E(t+1) = \arg\min_{E} J_\lambda(E, L(t), D(t))$   (optimum partitions)

$L(t+1) = \arg\min_{L} J_\lambda(E(t+1), L, D(t))$   (optimum codeword lengths)

$D(t+1) = \arg\min_{D} J_\lambda(E(t+1), L(t+1), D)$   (optimum code vectors)

Following the lines of argument of [5], one can show [24] that every limit point of
the sequence $(E(t), L(t), D(t))$, $t = 0, 1, \ldots$, generated by the transformation $T$
minimizes the Lagrangian $J_\lambda(E, L, D)$ (as given by (4)). Therefore, the EC-RVQ design
algorithm is guaranteed to converge to a local minimum.
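The three-step transformation can be sketched in Python for a scalar source. This is a toy illustration under several assumptions that are not taken from the paper: squared error distortion, exhaustive search of the direct-sum codebook in the partition step, unused indices pruned after the length update, and hypothetical function names and training values.

```python
from itertools import product
from math import log2

def reconstruct(idx, codebooks):
    """Direct-sum code value y(j) = sum_p y_p(j_p) (scalar source for brevity)."""
    return sum(cb[j] for cb, j in zip(codebooks, idx))

def lagrangian(train, assign, codebooks, lengths, lam):
    return sum((x - reconstruct(a, codebooks)) ** 2 + lam * lengths[a]
               for x, a in zip(train, assign)) / len(train)

def ec_rvq_iteration(train, codebooks, lengths, lam):
    # Partition step: rule (5), searched exhaustively over the surviving indices.
    assign = [min(lengths, key=lambda a: (x - reconstruct(a, codebooks)) ** 2
                  + lam * lengths[a]) for x in train]
    # Length step: self-information (6) of each index that is actually used.
    n = len(train)
    lengths = {a: -log2(assign.count(a) / n) for a in set(assign)}
    # Code vector step: Gauss-Seidel conditional-stage residual centroids.
    for p in range(len(codebooks)):
        for j in range(len(codebooks[p])):
            res = [x - reconstruct(a, codebooks) + codebooks[p][j]
                   for x, a in zip(train, assign) if a[p] == j]
            if res:
                codebooks[p][j] = sum(res) / len(res)
    return assign, lengths

def design(train, codebooks, lam, iterations=6):
    all_idx = list(product(*(range(len(cb)) for cb in codebooks)))
    lengths = {a: log2(len(all_idx)) for a in all_idx}   # fixed rate start
    history = []
    for _ in range(iterations):
        assign, lengths = ec_rvq_iteration(train, codebooks, lengths, lam)
        history.append(lagrangian(train, assign, codebooks, lengths, lam))
    return history

train = [0.1, 0.2, 0.3, 0.9, 1.1, 2.0, 2.1, 2.2, 3.0, 3.9, 4.0, 4.1]
print(design(train, [[0.0, 2.0], [0.0, 1.0]], lam=0.1))
```

Each of the three steps minimizes the Lagrangian with the other two quantities held fixed, so the recorded values form a non-increasing sequence in this exhaustive-search setting.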

To find the entire convex hull of the operational rate-distortion curve, the minimization
of $J_\lambda(E, L, D)$ is repeated for various values of $\lambda$. Starting with $\lambda = 0$ (which
corresponds to the RVQ codebook designed by the fixed rate RVQ design algorithm),
the EC-RVQ design algorithm uses a pre-determined sequence of $\lambda$'s [5] to design
variable rate EC-RVQ codebooks. A summary of the algorithm is given in Figure 3.

As in the design of fixed rate RVQ codebooks, multipath searching is used in
the encoder optimization step of the EC-RVQ design algorithm to closely satisfy the
encoding rule given by (5). The M-search algorithm is found to be very efficient in
substantially reducing the encoding complexity of EC-RVQ for only a small loss in
performance. Also, the Gauss-Seidel algorithm is used to find optimal stage code vectors
(i.e., stage code vectors that simultaneously satisfy the conditional-stage residual
centroid condition (2)).

Figure 3: The EC-RVQ design algorithm

Unique to EC-RVQ is optimization of the lengths of the codewords which represent
direct-sum partition cells or code vectors. Allowing the use of non-integer
codeword lengths, the self-information of a P-tuple index (or random variable)
$j = (j_1, j_2, \ldots, j_P)$, given by (6), is essentially the optimal length of the variable length
codeword associated with that index $j$. Equation (7) shows that such an optimal
length is also the sum of P stage conditional self-information components. Because
of the dependencies that usually exist between the stages of the RVQ, observations of
past encoding decisions provide some partial information about the pth stage index
$j_p$. While the estimation problem is difficult, one can still find a good estimate of the
lengths of variable-length stage codewords by using a sufficiently large training set.

It is evident that the aggregate number of tables of conditional-stage entropy codes
can become extremely large as the number of stages increases, and consequently the
storage requirements for the entropy tables may very well offset the memory savings
obtained by using RVQ, especially when the bit rate and/or the vector size is
large. For example, consider the design of EC-RVQ codebooks where each direct-sum
codebook contains 10 stages with 4 code vectors/stage. About 1.4 million
($4 + 4^2 + \cdots + 4^{10} = 1{,}398{,}100$) scalar memory locations are needed to store the
tables of conditional-stage entropy codes (for each direct-sum codebook). However, the
number of tables can be made very small by limiting the number of previous stages upon
which the conditioning is based. This can be accomplished by making a Markov-like
assumption and using conditional probabilities which depend only on the last
$m$ ($m \le p - 1$) stages. In other words, the direct-sum codeword length $|L(j)|$ is
approximated by

$|L(j)|_m = -\log_2 \mathrm{pr}(j_P | j_{P-1}, \ldots, j_{P-m}) - \log_2 \mathrm{pr}(j_{P-1} | j_{P-2}, \ldots, j_{P-m-1}) - \cdots - \log_2 \mathrm{pr}(j_2 | j_1) - \log_2 \mathrm{pr}(j_1)$   (7)

Since $H(J_p | J_{p-1}, \ldots, J_1) \le H(J_p | J_{p-1}, \ldots, J_{p-m})$ for each $p = 1, 2, \ldots, P$
and $m \le p - 1$, it follows that
$H_m(J) = \sum_{p=1}^{P} H(J_p | J_{p-1}, \ldots, J_{p-m}) \ge H(J)$.
When $m$ is small, the memory requirements to store these tables are relatively small,
but (of course) the performance of the associated EC-RVQ is also not as good as that
of the $(P-1)$th order ($m = P - 1$) EC-RVQ.
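The table-size arithmetic in this example is easy to reproduce. The Python sketch below counts one scalar entry per (context, index) pair, conditioning stage p on min(m, p - 1) previous stages; the function name `entropy_table_entries` is hypothetical.

```python
def entropy_table_entries(num_stages, stage_size, m=None):
    """Count the scalar entries in the conditional-stage entropy code tables.
    With m=None every stage is conditioned on all previous stages; otherwise
    stage p is conditioned on at most the last m stages."""
    total = 0
    for p in range(1, num_stages + 1):
        context = p - 1 if m is None else min(m, p - 1)
        total += stage_size ** (context + 1)
    return total

# The example from the text: 10 stages with 4 code vectors per stage.
print(entropy_table_entries(10, 4))        # 4 + 4^2 + ... + 4^10
print(entropy_table_entries(10, 4, m=1))
print(entropy_table_entries(10, 4, m=2))
```

Under this counting, first-order conditioning (m = 1) needs 148 entries and second-order conditioning (m = 2) needs 532, compared with 1,398,100 for full conditioning.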

It is of particular interest to find the performance gain as a function of m, which 
will help us assess how large a value of m is needed such that satisfactory performance 
is obtained. Such performance gain (as a function of m) can be estimated empirically. 
Figure 4 shows that the performance gain obtained as m is increased rises
rapidly toward the optimum and often saturates for very small values of m. Using small
values for m has the advantage that the memory requirements can be substantially 
reduced. 


5 Performance of EC-RVQ 


In this section, experimental results are used to compare the performance and com- 
plexity of EC-RVQ with those of EC-VQ over a wide range of bit rates and vector 
sizes. The training set consists of six (512 x 512, 8-bit) monochrome images taken 
from the USC database. Shifts and rotations are used to generate additional training 
vectors, leading to more than 200,000 4 x 4 vectors and more than 500,000 8 x 8
vectors. The image Lena, shown in Figure 5, is used for testing, and was not included
in the training set. In all experiments, the objective performance measure used is the
peak signal-to-quantization noise ratio (PSNR) defined by

$\mathrm{PSNR} = -10 \log_{10} \left( \frac{1}{N^2 (255)^2} \sum_{i} \sum_{j} \left( x(i,j) - \hat{x}(i,j) \right)^2 \right)$

where N x N is the size of the image (assumed to be square) and $x(i,j)$ and $\hat{x}(i,j)$
represent the original and coded values (respectively) of the pixel at the ith row and
the jth column of the image.

Figure 4: The rate-distortion performance of EC-RVQ (with 4 stages and 16 4 x 4
vectors/stage) for the test image Lena at increasing values of m.
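For reference, the PSNR measure can be computed as in the following Python sketch, which takes images as nested lists of pixel values. The function name `psnr` and the handling of a zero mean squared error (returning infinity) are assumptions made for this illustration.

```python
from math import log10

def psnr(original, coded, peak=255):
    """PSNR in dB: -10 log10( MSE / peak^2 ), with the MSE averaged over all pixels."""
    n = sum(len(row) for row in original)
    mse = sum((a - b) ** 2
              for row_o, row_c in zip(original, coded)
              for a, b in zip(row_o, row_c)) / n
    if mse == 0:
        return float("inf")       # identical images
    return -10 * log10(mse / peak ** 2)

# Tiny 1 x 1 "image" whose error is one tenth of the peak value.
print(psnr([[100.0]], [[125.5]]))
```

An error of one tenth of the peak amplitude at every pixel gives a mean squared error of one hundredth of the squared peak, i.e., a PSNR of 20 dB.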

EC-RVQ systems based on 4 x 4 vectors were investigated first, where the EC-RVQ
design algorithm with M = 4 and m = 1 was used to design a sequence of
variable rate RVQ codebooks. Each codebook contained 4 stage codebooks of size 16,
leading to a peak encoding rate of 1.0 bit per pixel (bpp). Likewise, the conventional
EC-VQ algorithm was used to design a sequence of codebooks of size $2^{12} = 4096$.
Although a moderate peak bit rate (i.e., 0.75 bpp) was used in the design of EC-VQ
codebooks, the design process required well over two months of CPU time on a Sun
4 SPARCstation. Figure 6 compares the distortion versus rate performance on Lena,
while Figure 7 compares the encoding complexity and the memory requirements for
the EC-VQ and EC-RVQ. Table 1 shows PSNR comparisons of EC-RVQ and EC-VQ
for four images (taken from the USC database), all at an output bit rate of 0.40
bits per pixel. EC-RVQ clearly outperforms EC-VQ in PSNR performance, encoding
complexity and memory requirements. An important factor influencing the gain of
EC-RVQ over EC-VQ is the very large number of direct-sum code vectors that EC-RVQ
makes available, even while maintaining a low average encoding rate. The
EC-VQ has a very limited codebook size (due to storage and search constraints), and
the size constraint is not inactive as the theory requires.

Figure 5: The original image Lena at 8 bits per pixel

Figure 6: The rate-distortion performance of EC-RVQ (top) and EC-VQ (bottom)
for the test image Lena. The vector size is 4 x 4.
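The peak encoding rates quoted for the codebook configurations in this section follow directly from the number of stages, the stage codebook size and the block size. A small Python check (the function name `peak_rate_bpp` is hypothetical):

```python
from math import log2

def peak_rate_bpp(num_stages, stage_size, block_side):
    """Peak (fixed rate) bits per pixel: num_stages * log2(stage_size) bits
    spent on a block of block_side x block_side pixels."""
    return num_stages * log2(stage_size) / block_side ** 2

print(peak_rate_bpp(4, 16, 4))     # 4-stage EC-RVQ, 4 x 4 vectors
print(peak_rate_bpp(1, 4096, 4))   # EC-VQ with 4096 code vectors, 4 x 4 vectors
print(peak_rate_bpp(7, 16, 4))     # 7-stage EC-RVQ, 4 x 4 vectors
print(peak_rate_bpp(7, 16, 8))     # 7-stage EC-RVQ, 8 x 8 vectors
```

These evaluate to 1.0, 0.75, 1.75 and 0.4375 bpp, matching the peak rates reported for the corresponding experiments.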

In the second set of experiments, 4 x 4 vectors were used in the design of variable
rate EC-RVQ codebooks with M = 4 and m = 2. The EC-RVQ codebooks contained
7 stage codebooks each with 16 code vectors, leading to a peak encoding rate of 1.75
bpp. Figure 8 shows the PSNR performance for the EC-RVQ (at two different peak
bit rates) for the test image Lena. As expected, EC-RVQ performance improves with
increased peak bit rate, in spite of maintaining the same average output bit rate. It
is noteworthy that EC-VQ based on high peak bit rates is not practical in general
because of the large memory and complexity associated with the encoding and design
procedures.

Figure 7: The encoding complexity, in number of vector distortion calculations (top
figure), and memory requirements (bottom figure) of EC-VQ (top) and EC-RVQ (bottom)
for the test image Lena.

Image       EC-VQ, peak = 0.75 bpp    EC-RVQ, peak = 1.00 bpp
Lena        30.97                     31.27
Boat        29.63                     30.21
Peppers     31.03                     31.32
Tiffany     30.08                     30.29

Table 1: PSNR (dB) of EC-RVQ and EC-VQ for four images taken from the USC database.
The bit rate is 0.40 bpp. The vector size is 4 x 4.

Figure 8: The rate-distortion performance (PSNR versus average encoding rate in bpp)
of EC-RVQ for the test image Lena at two different peak bit rates (1.75 bpp for top,
1.00 bpp for bottom). The vector size is 4 x 4.

In the last set of experiments, 8 x 8 vectors were used in the design of EC-RVQ
with M = 4 and m = 2 and with 7 stage codebooks of size 16. The maximum bit rate is
then 0.4375 bpp. Figure 9 shows the image Lena coded at an average encoding rate of
0.1505 bpp. The PSNR is about 30 dB, and the subjective quality is rather good for a
compression ratio of about 50:1. Practical EC-VQ systems are limited to relatively
small vector sizes (typically 4 x 4) due to the exorbitant encoding and memory
demands needed to implement such quantizers. While the EC-RVQ coding results (for
8 x 8 vectors) at such low bit rates cannot be compared with those of EC-VQ, they
appear to be almost as good as those of the more complex hybrid
subband/EC-RVQ/entropy coders reported in [21, 22].
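The maximum bit rate and the compression ratio quoted for the 8 x 8 experiment follow from simple arithmetic, sketched below. The variable names are hypothetical, and the peak rate is again assumed to correspond to fixed-length stage indices.

```python
import math

ORIGINAL_BPP = 8.0       # the original Lena image is stored at 8 bits per pixel
AVERAGE_BPP = 0.1505     # average encoding rate reported for the coded image

NUM_STAGES = 7
STAGE_SIZE = 16
VECTOR_DIM = 8 * 8       # 8 x 8 image blocks

# Peak rate: 7 stages, log2(16) = 4 bits each, spread over 64 pixels.
peak_bpp = NUM_STAGES * math.log2(STAGE_SIZE) / VECTOR_DIM

# Compression ratio relative to the 8 bpp original.
compression_ratio = ORIGINAL_BPP / AVERAGE_BPP

# Fraction of the peak rate that is used on average.
rate_fraction = AVERAGE_BPP / peak_bpp

print(f"peak rate          = {peak_bpp} bpp")
print(f"compression ratio  = {compression_ratio:.1f} : 1")
print(f"average / peak     = {rate_fraction:.2f}")
```

The ratio comes out slightly above 53:1, consistent with the "about 50:1" figure in the text.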

6 Closing Remarks 

The entropy-constrained RVQ introduced in this paper has many attractive features
for data compression. In particular, its performance is among the best available to
date. There also appear to be several areas where improvement can be made. One in
particular is the entropy coding employed in the design algorithm. Equation (6)
assumes the use of codewords that have non-integer lengths, and results in an
average rate which is exactly equal to the output entropy of the EC-RVQ codebook.
One can instead employ (during the EC-RVQ design) the entropy coding algorithm that
would follow the EC-RVQ. When employing a Huffman coding algorithm, both
alternatives produced overall EC-VQ systems with nearly identical performance [5].
However, this may not be true in the case of EC-RVQ because the tables of
conditional probabilities are usually very small (e.g., of size 4, 8, or 16), and
the average lengths of the corresponding entropy codes may not be as close to the
output entropy. Therefore, incorporating a Huffman encoder into the EC-RVQ design
algorithm may lead to a significant increase in performance when that entropy coder
is used to encode the RVQ stage indices. Another important issue related to the
entropy coding problem is that, since the conditional probability distributions of
the latter stages are usually very skewed, other entropy coding techniques (such as
arithmetic coding) may perform better than Huffman-based techniques. Experiments in
which a Huffman encoder and an arithmetic encoder are separately incorporated into
the EC-RVQ design algorithm are presently being carried out.

Figure 9: The image Lena coded at 0.1505 bpp. The vector size is 8 x 8. The PSNR
is 30.05 dB.
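The gap between the output entropy and the average Huffman codeword length for a small, skewed table can be illustrated numerically. The sketch below is not taken from the paper: the four-entry probability table is purely hypothetical, and the Huffman construction is the standard textbook one.

```python
import heapq
import math

def entropy_bits(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def huffman_lengths(probs):
    """Codeword length of each symbol under a binary Huffman code."""
    # Heap items: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol they contain.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths

# Hypothetical skewed table of size 4 (illustrative values only).
probs = [0.85, 0.08, 0.05, 0.02]

lengths = huffman_lengths(probs)
avg_len = sum(p * n for p, n in zip(probs, lengths))
entropy = entropy_bits(probs)

print(f"Huffman lengths  : {lengths}")
print(f"average length   : {avg_len:.2f} bits/symbol")
print(f"entropy          : {entropy:.2f} bits/symbol")
```

For this table the Huffman code averages 1.22 bits per symbol against an entropy of roughly 0.82 bits. A Huffman code can never spend less than one bit per symbol, which is why an arithmetic coder may be the better match for very skewed stage distributions.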

Another possible area for improvement is in the entropy coding structure. The 
present design algorithm is based on static entropy codes. However, with both the size 
and the number of the tables being relatively small, the possibility of adaptive entropy 
coding exists. This is another variation of the system presently being investigated. 

Finally, we point out that EC-RVQ has some potential advantages in terms of
channel insensitivity. Fixed rate RVQ tends to be less sensitive to channel errors
than conventional VQ. A bit error could be disastrous for a conventional VQ, but is
usually less serious when an RVQ is used. This nice property of RVQ seems to be lost
when an EC-RVQ is used, because a bit error in one of the stage codewords will very
likely propagate through the subsequent stages and prevent the RVQ decoder from
correctly decoding the variable length codewords of the remaining stages. However,
the EC-RVQ becomes less sensitive to channel errors if the variable length codewords
of the first few stages are protected.
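The difference between fixed rate and variable rate stage codewords under a single bit error can be seen in a toy example. This is only a sketch under simplifying assumptions: both code tables and the index sequence are hypothetical, and a single static prefix code stands in for the conditional entropy codes of an actual EC-RVQ.

```python
def encode(indices, code):
    """Concatenate the codewords of a sequence of stage indices."""
    return "".join(code[i] for i in indices)

def decode(bits, code):
    """Decode a bit string with a prefix-free code table."""
    inverse = {word: idx for idx, word in code.items()}
    out, word = [], ""
    for b in bits:
        word += b
        if word in inverse:
            out.append(inverse[word])
            word = ""
    return out

def flip(bits, pos):
    """Return the bit string with the bit at position pos inverted."""
    flipped = "1" if bits[pos] == "0" else "0"
    return bits[:pos] + flipped + bits[pos + 1:]

def count_errors(sent, received):
    # zip stops at the shorter list; in general a bit error in a
    # variable-length stream can also change the number of decoded symbols.
    return sum(a != b for a, b in zip(sent, received))

# Hypothetical code tables for a stage codebook of size 4.
fixed_code = {0: "00", 1: "01", 2: "10", 3: "11"}
variable_code = {0: "0", 1: "10", 2: "110", 3: "111"}

stage_indices = [0, 2, 0, 1, 0, 3]  # hypothetical stage indices

fixed_decoded = decode(flip(encode(stage_indices, fixed_code), 0), fixed_code)
variable_decoded = decode(flip(encode(stage_indices, variable_code), 0), variable_code)

fixed_errors = count_errors(stage_indices, fixed_decoded)
var_errors = count_errors(stage_indices, variable_decoded)

print("fixed-length   :", fixed_decoded, "errors:", fixed_errors)
print("variable-length:", variable_decoded, "errors:", var_errors)
```

With the fixed-length code the flipped bit corrupts only the index it belongs to, whereas with the variable-length code the decoder loses codeword alignment and the following index is also decoded incorrectly. Because the flipped bit is the first one, this toy example also hints at why protecting the codewords of the first few stages limits the damage.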